Real-time Person Re-identification at the Edge: A Mixed Precision Approach
A critical component of multi-person multi-camera tracking is the person
re-identification (re-ID) algorithm, which recognizes and retains the identities
of all detected, previously unknown people throughout the video stream. Many
re-ID algorithms today achieve state-of-the-art results, but little work has
been done to explore the deployment of such algorithms in computation- and
power-constrained real-time scenarios. In this paper, we study the effect of
using a lightweight model, MobileNet-v2, for re-ID and investigate the impact of
single (FP32) versus half (FP16) precision for training on the server and
inference on the edge nodes. We further compare the results with a baseline
model that uses ResNet-50 on state-of-the-art benchmarks including CUHK03,
Market-1501, and DukeMTMC. The MobileNet-v2 mixed-precision training method
improves both inference throughput on the edge node, reaching 27.77 fps, and
training time on the server, and decreases power consumption on the edge node,
while degrading accuracy by only 5.6% with respect to single-precision
ResNet-50, averaged over the three datasets. The code and pre-trained networks are publicly
available at https://github.com/TeCSAR-UNCC/person-reid.
Comment: This is a pre-print of an article published in International
Conference on Image Analysis and Recognition (ICIAR 2019), Lecture Notes in
Computer Science. The final authenticated version is available online at
https://doi.org/10.1007/978-3-030-27272-2_
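The FP32-versus-FP16 trade-off at the heart of this paper can be illustrated independently of any training framework. The following minimal NumPy sketch (the tensor shape is an arbitrary illustration, not an actual MobileNet-v2 layer) shows the two effects half precision exploits: halved memory footprint and a small, bounded rounding error:

```python
import numpy as np

# A mock weight tensor; the shape is a hypothetical example.
w32 = np.random.default_rng(0).standard_normal((1280, 320)).astype(np.float32)
w16 = w32.astype(np.float16)  # half-precision copy

# Half precision stores the same values in half the memory...
print(w32.nbytes // w16.nbytes)  # 2

# ...at the cost of a small rounding error on each element.
err = np.abs(w32 - w16.astype(np.float32)).max()
print(err < 1e-2)  # True: nonzero but small for unit-scale weights
```

In mixed-precision training, frameworks keep an FP32 master copy of the weights and run most arithmetic in FP16 to gain this throughput, which is consistent with the server-side training speedup reported above.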
ATRW: A Benchmark for Amur Tiger Re-identification in the Wild
Monitoring the population and movements of endangered species is an important
task for wildlife conservation. Traditional tagging methods do not scale to
large populations, while applying computer vision methods to camera sensor data
requires re-identification (re-ID) algorithms to obtain accurate counts and
movement trajectories of wildlife. However, existing re-ID methods are largely
targeted at persons and cars, which have limited pose variations and
constrained capture environments. This paper tries to fill the gap by
introducing a novel large-scale dataset, the Amur Tiger Re-identification in
the Wild (ATRW) dataset. ATRW contains over 8,000 video clips from 92 Amur
tigers, with bounding box, pose keypoint, and tiger identity annotations. In
contrast to typical re-ID datasets, the tigers are captured in a diverse set of
unconstrained poses and lighting conditions. We demonstrate with a set of
baseline algorithms that ATRW is a challenging dataset for re-ID. Lastly, we
propose a novel method for tiger re-identification, which introduces precise
pose-part modeling in deep neural networks to handle the large pose variation of
tigers, and achieves a notable performance improvement over existing re-ID
methods. The dataset is publicly available at https://cvwc2019.github.io/ .
Comment: ACM Multimedia (MM) 202
Person Re-identification with Deep Similarity-Guided Graph Neural Network
The person re-identification task requires robustly estimating visual
similarities between person images. However, existing person re-identification
models mostly estimate the similarities of different probe-gallery image pairs
independently, ignoring the relational information between different
probe-gallery pairs. As a result, the similarity estimates for some hard samples
may be inaccurate. In this paper, we propose a novel deep
hard samples might not be accurate. In this paper, we propose a novel deep
learning framework, named Similarity-Guided Graph Neural Network (SGGNN) to
overcome such limitations. Given a probe image and several gallery images,
SGGNN creates a graph to represent the pairwise relationships between
probe-gallery pairs (nodes) and utilizes such relationships to update the
probe-gallery relation features in an end-to-end manner. Accurate similarity
estimation can be achieved by using such updated probe-gallery relation
features for prediction. The input features for nodes on the graph are the
relation features of different probe-gallery image pairs. The probe-gallery
relation feature updating is then performed by message passing in SGGNN,
which takes other nodes' information into account for similarity estimation.
Different from conventional GNN approaches, SGGNN learns the edge weights
directly from the rich labels of gallery instance pairs, which provides more
precise information for relation fusion. The effectiveness of our proposed
method is validated on three public person re-identification datasets.
Comment: accepted to ECCV 201
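A single round of the relation-feature updating described above can be sketched in plain NumPy; the feature size, the mixing weight `alpha`, and the similarity scores below are illustrative placeholders, not SGGNN's learned quantities:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 5, 8                          # 5 gallery images, 8-dim relation features
rel = rng.standard_normal((n, d))    # node features: probe-gallery relation vectors
sim = rng.random((n, n))             # stand-in for learned gallery-gallery edge weights
np.fill_diagonal(sim, 0.0)           # a node sends no message to itself
w = sim / sim.sum(axis=1, keepdims=True)  # row-normalised edge weights

alpha = 0.9                          # fraction of each node's own feature retained
updated = alpha * rel + (1 - alpha) * (w @ rel)  # one message-passing round

print(updated.shape)  # (5, 8)
```

Each node's relation feature is refined by a weighted average of the other nodes' features, which is how information from other probe-gallery pairs can correct the similarity estimate of a hard sample.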
Tracking Multiple People Online and in Real Time
We cast the problem of tracking several people as a graph partitioning problem that takes the form of an NP-hard binary integer program. We propose a tractable, approximate, online solution through the combination of a multi-stage cascade and a sliding temporal window. Our experiments demonstrate significant accuracy improvement over the state of the art and real-time post-detection performance.
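As a rough illustration of why an online, windowed approximation can be tractable where the exact integer program is not, one can link detections greedily frame by frame; the 1-D positions and the distance gate below are hypothetical, and this greedy rule is far simpler than the paper's cascade:

```python
# Hypothetical sketch: greedy frame-to-frame linking inside a sliding window,
# a cheap approximation to exact (NP-hard) graph partitioning.
def link_frame(tracks, detections, max_dist=50.0):
    """Greedily assign each detection to the nearest open track."""
    for det in detections:
        best, best_d = None, max_dist
        for tr in tracks:
            d = abs(tr[-1] - det)          # 1-D positions for illustration
            if d < best_d:
                best, best_d = tr, d
        if best is not None:
            best.append(det)               # extend the closest track
        else:
            tracks.append([det])           # start a new track
    return tracks

tracks = []
for frame in [[10.0, 200.0], [12.0, 198.0], [15.0, 195.0]]:
    tracks = link_frame(tracks, frame)

print(len(tracks))  # 2: two people, tracked across three frames
print(tracks[0])    # [10.0, 12.0, 15.0]
```

Greedy per-frame assignment runs in time linear in the number of frames, at the cost of optimality, which is the trade-off an approximate online solver makes explicit.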
Tracking social groups within and across cameras
We propose a method for tracking groups from single and multiple cameras with disjoint fields of view. Our formulation follows the tracking-by-detection paradigm, where groups are the atomic entities and are linked over time to form long and consistent trajectories. To this end, we formulate the problem as a supervised clustering problem in which a Structural SVM classifier learns a similarity measure appropriate for group entities. Multi-camera group tracking is handled inside the framework by adopting an orthogonal feature encoding that allows the classifier to learn inter- and intra-camera feature weights differently. Experiments were carried out on a novel annotated group tracking data set, the DukeMTMC-Groups data set. Since this is the first data set for the problem, we also propose a suitable evaluation measure. Results of adopting learning for the task are encouraging, scoring a +15% improvement in F1 measure over a non-learning-based clustering baseline. To our knowledge, this is the first proposal of this kind dealing with multi-camera group tracking.
Appearance features for online multiple camera multiple target tracking
Multiple object tracking methods in the state of the art are challenged by appearance variation, environment changes, and long-term occlusions. Exploiting multiple calibrated and frame-synchronized cameras holds the promise of alleviating these problems, in particular the one pertaining to occlusion. The practical realization of this idea faces the problem that the appearance of the same target can change across different cameras. Thus, particular care should be taken to enhance the computation of appearance distances between targets in multiple cameras. In this paper, we tackle the problem of multiple-object multiple-camera tracking by adopting a Markov Decision Process framework. We concentrate on the effect of the affinity function by discussing different possible implementations and validating their performance, in terms of the MOT metric and the ID measure, on the PETS 2009 and EPFL datasets. Our experimental results show a significant improvement of multi-camera approaches with a sufficiently large overlapping zone compared to single-camera ones.
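One concrete choice of appearance affinity, in the spirit of the implementations compared above, is the Bhattacharyya distance between colour histograms; the histogram size and this particular measure are illustrative assumptions, not the paper's exact affinity function:

```python
import numpy as np

def appearance_distance(hist_a, hist_b):
    """Bhattacharyya distance between two L1-normalised histograms,
    a common appearance affinity term in multi-camera tracking."""
    bc = np.sum(np.sqrt(hist_a * hist_b))   # Bhattacharyya coefficient
    return np.sqrt(max(0.0, 1.0 - bc))      # 0 = identical, 1 = disjoint

rng = np.random.default_rng(2)
h1 = rng.random(32); h1 /= h1.sum()         # target seen in camera 1
h2 = rng.random(32); h2 /= h2.sum()         # candidate seen in camera 2

print(appearance_distance(h1, h1) < 1e-6)         # True: identical appearance
print(0.0 <= appearance_distance(h1, h2) <= 1.0)  # True: distance is bounded
```

Because the measure is bounded and symmetric, it plugs naturally into an association cost; handling the cross-camera appearance shift the abstract mentions is exactly what motivates comparing such affinity functions.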